339 results found.
Speech
Corpus,
Language Type:
Monolingual
Languages:
Japanese
Availability:
Freely Available
License:
Size:
3.5 GByte
Production Status:
Existing-used
Use:
Speech Synthesis
-
Paper title:Cross-lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space
-
Paper track:7.14 Cross-lingual and multilingual aspects in spe/Poster Presentation
-
Paper status:Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Detai Xin | JVS (Japanese versatile speech) corpus | /N |
Documentation:
English and Japanese documentation available
Speech/Written
Treebank,
Language Type:
Monolingual
Languages:
Japanese
Availability:
Freely Available
License:
Web-accessible and downloadable
Size:
30,000 sentences
Production Status:
Newly created-in progress
Use:
for linguistic research, language learning, and semantic role labeling
-
Paper title:Constructing Web-Accessible Semantic Role Labels and Frames for Japanese as Additions to the NPCMJ Parsed Corpus
-
Paper track:Evaluation/oral presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Koichi Takeuchi | NPCMJ | /N |
Documentation:
http://npcmj.ninjal.ac.jp/wp-content/uploads/2019/05/npcmj_annotation_manual_jp_201904.pdf (Japanese)
Written
Corpus,
Language Type:
Bilingual
Languages:
English Japanese
Availability:
Freely Available
License:
Research purposes only
Size:
8763995 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus
-
Paper track:Written/poster presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Makoto Morishita | JParaCrawl | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
Japanese
Availability:
From Data Center(s)
License:
Size:
661 hours
Production Status:
Existing-used
Use:
Speech Synthesis
-
Paper title:DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus
-
Paper track:Speech/poster presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Tomoki Koriyama | Corpus of Spontaneous Japanese | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English Farsi French German Japanese
Availability:
Freely Available
License:
Size:
4.5 MByte
Production Status:
Existing-updated
Use:
Document Classification, Text categorisation
-
Paper title:Multi-class Multilingual Classification of Wikipedia Articles Using Extended Named Entity Tag Set
-
Paper track:Evaluation/oral presentation
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hassan S. Shavarani | Shinra-5LDS Dataset | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Arabic Catalan Chinese Dutch Estonian French German Indonesian Italian Japanese Latvian Mongolian Persian Portuguese Russian Slovenian Spanish Swedish Tamil Turkish Welsh
Availability:
Freely Available
License:
CC0
Size:
2880 hours
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:CoVoST 2 and Massively Multilingual Speech Translation
-
Paper track:12.1 Spoken machine translation/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Juan Pino | CoVoST 2 | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Cantonese Indonesian Japanese Kazakh Korean Mandarin Russian Tibetan Uyghur Vietnamese
Availability:
From Owner
License:
Speechocean and Center for Speech and Language Technologies (Tsinghua University)
Size:
None GByte
Production Status:
Existing-used
Use:
Language Identification
-
Paper title:Language recognition on unknown conditions: the LORIA-Inria-MULTISPEECH system for AP20-OLR Challenge
-
Paper track:14.4 Oriental Language Recognition/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | Oriental Language Recognition challenge 2020 corpus | /N |
Documentation:
Evaluation plan paper
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic English Farsi French German Hindi Japanese Korean Mandarin Russian Spanish Tamil Vietnamese
Availability:
From Owner
License:
LDC
Size:
46 hours
Production Status:
Existing-used
Use:
Language Identification
-
Paper title:Modeling and training strategies for language recognition systems
-
Paper track:4.1 Language identification and verification, lang/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2003 NIST Language Recognition Evaluation | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
English Hindi Japanese Korean Mandarin Spanish Tamil
Availability:
From Owner
License:
LDC
Size:
73 hours
Production Status:
Existing-used
Use:
Language Identification
-
Paper title:Modeling and training strategies for language recognition systems
-
Paper track:4.1 Language identification and verification, lang/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2005 NIST Language Recognition Evaluation | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic Bengali Dari English German Hindi Iranian Persian Japanese Korean Mandarin Chinese Persian Russian Spanish Standard Arabic Tamil Thai Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
66 hours
Production Status:
Existing-used
Use:
Language Identification
-
Paper title:Modeling and training strategies for language recognition systems
-
Paper track:4.1 Language identification and verification, lang/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2007 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None




